Introduction

The main goal of my thesis work is to create a model using presence/absence data to determine Gila monster (Heloderma suspectum) occupancy and detection probability in the Mojave desert in southern Utah. We conducted 500 time-constrained 1-km transect surveys across 6 sites in St. George, UT to collect data for this analysis. Along a transect, surveyors collected environmental and landscape data every 0.25 km (including start and end points).

We created a “Rockiness Index” on a scale from 1 - 7 to quantify the variablity in substrate texture along a transect:

1: Sandstone
2: Rocky-Cliff
3: Rocky-Debris Slope
4: Gravel-Small Rock
5: Stabilized Sand
6: Loose Sand
7: Fine Sand


For this exercise I am using a binary GLM to determine if there is a relationship between the probabilty of detecting a Gila monster during a survey and the substrate texture (as defined by the Rockiness Index). Addtionally creating an equation which best predicts Gila monster detection in a specific substrate type can inform future surveys for Gila monsters which are extremely difficult to find.

Plots

First I created a generalized linear model plot and a linear model plot, then I used the patchwork library to view the plots side-by-side.

plot.glm <-
  ggplot(gilas, aes(rock_ind, detection)) +    
  geom_jitter(size=2, width = 0.25) +       # jitter helps to see all data points; width for spread     
  geom_smooth(method="glm",
              fullrange = TRUE,                                       
              method.args=list(family="binomial"(link="logit"))) +
  labs(title="GM GLM Fit") +
  ylab ("Probability of Detection") +
  xlab ("Substrate Texture Index")
plot.lm <-
  ggplot(gilas, aes(rock_ind, detection)) +
  geom_jitter(size=2, width = 0.25) +
  geom_smooth(method = "lm",
              fullrange= TRUE) +
  labs(title = "GM Linear Fit") +
  ylab("Probability of Detection") +
  xlab("Substrate Texture Index")
plot.lm + plot.glm
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'

Once I visualized the relationship I created and ran the model to look at the slope and intercept.

model_rockiness <- glm(detection ~ rock_ind, data = gilas, family = binomial)
model_rockiness
## 
## Call:  glm(formula = detection ~ rock_ind, family = binomial, data = gilas)
## 
## Coefficients:
## (Intercept)     rock_ind  
##      -13.57         1.76  
## 
## Degrees of Freedom: 2520 Total (i.e. Null);  2519 Residual
## Null Deviance:       242.9 
## Residual Deviance: 187.4     AIC: 191.4

To plot the model on the logit scale, I created a new column of logit fit “rockiness” data.

new_gilas <- data.frame(rock_ind = seq(0, 7, length.out = 100))
new_gilas$logit_fit <- predict(model_rockiness, newdata = new_gilas, type = "link") 

I also adjusted the detection presence/absence data to avoid a blank plot (i.e., log(0) or log(1)).

gilas$detection_adj <- (gilas$detection + 0.5) / 2
gilas$logit_obs <- log(gilas$detection_adj / (1 - gilas$detection_adj))

Then I plot the model on the logit scale:

ggplot() +
  geom_point(data = gilas, aes(x = rock_ind, y = logit_obs), color = "black", size = 3, alpha = 0.8) +
  geom_line(data = new_gilas, aes(x = rock_ind, y = logit_fit), color = "orange", linewidth = 1.2) +
  labs(
    title = "Substrate Model on Logit Scale",
    subtitle = "intercept = -13.57, slope = 1.76",
    x = "Substrate Texture Index",
    y = "Logit(Probability of Detection)")

Model Interpretation

Next I wanted to look at the binned residuals and interpret the summary statistics.

x <- predict(model_rockiness)
y <- resid(model_rockiness)
binnedplot(x,y)

My residuals look like this because the number of detections (21) is so small that the model cannot calculate standard errors. But I still want to interpret the slope to determine the relationship between the substrate type and detection probablity

summary(model_rockiness)
## 
## Call:
## glm(formula = detection ~ rock_ind, family = binomial, data = gilas)
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -13.5671     1.5790  -8.592  < 2e-16 ***
## rock_ind      1.7599     0.2737   6.430 1.27e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 242.92  on 2520  degrees of freedom
## Residual deviance: 187.39  on 2519  degrees of freedom
## AIC: 191.39
## 
## Number of Fisher Scoring iterations: 9
confint(model_rockiness)
## Waiting for profiling to be done...
##                  2.5 %     97.5 %
## (Intercept) -16.946121 -10.716701
## rock_ind      1.252094   2.332588
coef(model_rockiness)
## (Intercept)    rock_ind 
##  -13.567122    1.759908

Looking at the summary table, we can see that substrate texture does have a significant positive effect on detection probability. The p-value is < 0.001 and the residual deviance is less than the degrees of freedom. The confidence intervals also do not overlap 0, so we can be confident that the relationship is positive.

I estimated the maximum predicted effect of substrate type on detection probability by dividing the slope (1.759) by 4 (max predicted effect = 0.439). In this case, since my predictor variable is ordinal data (an index) I can interpret this to mean that as the substrate type “increases” from 1 - 7 (rockiest to sandiest) the probability of detecting a Gila monster increases by 44%.

Biologically this makes sense because as a surveyor your chance of detecting a Gila monster or sign of a Gila will be greater in sandy washes or patches of sand where tracks can be left behind. Versus rocky cliffs or large sandstone structures where Gilas may be present but significantly harder to detect.